On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems — Appendix
Abstract
Proof. Kalai and Lehrer [2] studied a model which can be equivalently described as a single-state SBG (i.e. |S| = 1) with a pure type distribution and product posterior. They showed that, if a player's assessment of future play is absolutely continuous with respect to the true probabilities of future play (i.e. any event that has positive true probability is assigned positive probability by the player), then (1) must hold. In our case, absolute continuity always holds by Assumption 5, the fact that the prior probabilities Pj are positive, and the fact that the type distribution is pure (from which we can infer that the true types always have positive posterior probability). In this proof, we extend the convergence result of [2] to multi-state SBGs with pure type distributions. Our strategy is to translate an SBG Γ into a modified SBG Γ̂ which is equivalent to Γ in the sense that the players behave identically, and which is ...
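As an illustrative aside (not part of the paper's formal argument), the following minimal Python sketch shows the mechanism behind the absolute-continuity observation above: with a strictly positive prior over a finite set of types, and observations generated by one of those types, the Bayesian posterior of the true type can never reach zero, because the true type assigns positive probability to every action it actually plays. The policies, prior, and all names below are hypothetical, chosen only for illustration.

import numpy as np

# Posterior over a finite set of candidate types, updated by Bayes' rule.
# Illustrates why a strictly positive prior plus a pure type distribution
# implies that the true type always retains positive posterior probability.

rng = np.random.default_rng(0)

# Three hypothetical types, each a fixed policy over two actions.
types = np.array([
    [0.9, 0.1],  # type 0
    [0.5, 0.5],  # type 1 (the true type in this example)
    [0.2, 0.8],  # type 2
])
true_type = 1

posterior = np.full(3, 1.0 / 3.0)  # strictly positive prior (the Pj)

for t in range(200):
    action = rng.choice(2, p=types[true_type])  # observe the true type's action
    posterior *= types[:, action]               # Bayes: weight by likelihood
    posterior /= posterior.sum()                # renormalise
    # The true type's likelihood of the observed action is always positive,
    # so its posterior never reaches zero (absolute continuity in miniature).
    assert posterior[true_type] > 0

print(posterior)  # mass concentrates on types consistent with observed play

Over many observations the posterior concentrates on the types whose predictions match the true play, which is the intuition behind the merging result of [2] that the proof extends.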
Similar resources
On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems
While many multiagent algorithms are designed for homogeneous systems (i.e. all agents are identical), there are important applications which require an agent to coordinate its actions without knowing a priori how the other agents behave. One method to make this problem feasible is to assume that the other agents draw their latent policy (or type) from a specific set, and that a domain expert c...
Unifying Convergence and No-Regret in Multiagent Learning
We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-...
Convergence, Targeted Optimality, and Safety in Multiagent Learning
This paper introduces a novel multiagent learning algorithm, Convergence with Model Learning and Safety (or CMLeS in short), which achieves convergence, targeted optimality against memory-bounded adversaries, and safety, in arbitrary repeated games. The most novel aspect of CMLeS is the manner in which it guarantees (in a PAC sense) targeted optimality against memory-bounded adversaries, via ef...
A Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem
Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...
Multiagent reinforcement learning: algorithm converging to Nash equilibrium in general-sum discounted stochastic games
Reinforcement learning has turned out to be a technique that has allowed robots to ride a bicycle, computers to play backgammon at the level of human world masters, and such complicated high-dimensional tasks as elevator dispatching to be solved. Can it come to the rescue in the next generation of challenging problems, like playing football or bidding on virtual markets? Reinforcement learning that provides a way o...
Publication date: 2014